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Abstract — Concurrently estimating the 6-DOF pose of multiple 
cameras or robots — cooperative localization — is a core problem 
in contemporary robotics. Current works focus on a set of 
mutually observable world landmarks and often require inbuilt 
egomotion estimates; situations in which both assumptions are 
violated often arise, for example, robots with erroneous low 
quality odometry and IMU exploring an unknown environment. 
In contrast to these existing works in cooperative localization, we 
propose a cooperative localization method, which we call mutual 
localization, that uses reciprocal observations of camera-fiducials 
to obviate the need for egomotion estimates and mutually ob- 
servable world landmarks. We formulate and solve an algebraic 
formulation for the pose of the two camera mutual localization 
setup under these assumptions. Our experiments demonstrate the 
capabilities of our proposal egomotion-free cooperative localiza- 
tion method: for example, the method achieves 2cm range and 0.7 
degree accuracy at 2m sensing for 6-DOF pose. To demonstrate 
the applicability of the proposed work, we deploy our method 
on Turtlebots and we compare our results with ARToolKit |1J 
and Bundler |2|, over which our method achieves a tenfold 
improvement in translation estimation accuracy. 

I. Introduction 

Cooperative localization is the problem of finding the rel- 
ative 6-DOF pose between robots using sensors from more 
than one robot. Various strategies involving different sensors 
have been used to solve this problem. For example, Cognetti 
et al. |3|, ||4| use multiple bearning-only observations with a 
motion detector to solve for cooperative localization among 
multiple anonymous robots. Trawny et al. |5 1 and lately Zhou 
et al. ||6|, ||7| provide a comprehensive mathematical analysis 
of solving cooperative localization for different cases of sensor 
data availability. Section [ll] covers related literature in more 
detail. 

To the best of our knowledge, all other cooperative localiza- 
tion works (see Section require estimation of egomotion. 
However, a dependency on egomotion is a limitation for sys- 
tems that do not have gyroscopes or accelerometers, which can 
provide displacement between two successive observations. 
Visual egomotion, like MonoSLAM |8|, using distinctive im- 
age features estimates requires high quality correspondences, 
which remains a challenge in machine vision, especially in 
cases of non-textured environments. Moreover, visual egomo- 
tion techniques are only correct upto a scale factor. Contempo- 
rary cooperative localization methods that use egomotion 
|[6|, |[9| yield best results only with motion perpendicular to 
the direction of mutual observation and fails to produce results 
when either observer undergoes pure rotation or motion in 
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Figure 1: Simplified diagram for the two-camera problem. 
Assuming the length of respective rays to be Si, 52, 53, 54 
respectively, each marker coordinates can be written in both 
coordinate frames {p} and {q}. For example Mi is sipi in 
frame {p} and qi in {q}, where pi unit vector parallel to pi. 



the direction of observation. Consequently, in simple robots 
like Turtlebot, this technique produces poor results because 
of absence of sideways motion that require omni-directional 
wheels. 

To obviate the need for egomotion, we propose a method for 
relative pose estimation that leverages distance between fidu- 
cial markers mounted on robots for resolving scale ambiguity. 
Our method, which we call mutual localization, depends upon 
the simultaneous mutual/reciprocal observation of bearing- 
only sensors. Each sensor is outfitted with fiducial markers 
(Fig. [T]) whose position within the host sensor coordinate sys- 
tem is known, in contrast to assumptions in earlier works that 
multiple world landmarks would be concurrently observable 
by each sensor |10|. Since our method does not depend on 
egomotion, hence it is instantaneous, which means it is robust 
to false negatives and it do not suffers from the errors in 
egomotion estimation. 

The main contribution of our work is a generalization 
of Perspective-3 -Points (P3P) problem where observer and 
the observed points are distributed in different reference 
frames unlike conventional approach where observer's refer- 
ence frame do not contain any observed points and vice versa. 
In this paper we present an algebraic derivation to solve for 
the relative camera pose (rotation and translation) of the two 
bearing-only sensors in the case that each can observe two 
known fiducial points in the other sensor; essentially giving 
an algebraic system to compute the relative pose from four 
correspondences (only three are required in our algorithm 
but we show how the fourth correspondence can be used 
to generate a set of hypothesis solutions from which best 
solution can be chosen). Two fiducial points on each robot 
(providing four correspondences) are preferable to one on one 
and two on the other, as it allows extension to multi-robot 
(> 2) systems ensuring that any pair of similarly equipped 
robots can estimate their relative pose. In this paper, we focus 



on only two robot case as an extension to multi-robot case as 
pairwise localization is trivial yet practically effective. 

Our derivation, although inspired by the linear pose es- 
timation method of Quan and Lan |11|, is novel since all 
relevant past works we know on P3P problem 1 12], assume all 
observations are made in one coordinate frame and observed 
points in the other. In contrast, our method makes no such as- 
sumption and concurrently solves the pose estimation problem 
for landmarks sensed in camera- specific coordinates frames. 

We demonstrate the effectiveness of our method, by analyz- 
ing its accuracy in both synthetic, which affords quantitative 
absolute assessment, and real localization situations by deploy- 
ment on Turtlebots. We use 3D reconstruction experiments 
to show the accuracy of our algorithm. Our experiments 
demonstrate the effectiveness of the proposed approach. 

II. Related Work 

Cooperative localization has been extensively studied and 
applied to various applications. One of the latest works in this 
area comes from Cognetti et al. fS], (4] where they focus 
on the problem of cooperatively localizing multiple robots 
anonymously. They use multiple bearing-only observations 
and a motion detector to localize the robots. The robot detector 
is a simple feature extractor that detects vertical cardboard 
squares mounted atop each robot in the shadow zone of the 
range finder. One of oldest works come from Karazume et. 
al. 1 13 1 where they focus on using cooperative localization 
as a substitute to dead reckoning by suggesting a "dance" in 
which robots act as mobile landmarks. Although they do not 
use egomotion, but instead assume that position of two robots 
are known while localizing the third robot. Table |T| summarizes 
a few closely related works with emphasis on how our work 
is different different from each of them. Rest of the section 
discusses those in detail. 

Howard et al. 1 14] coined the CLAM (Cooperative Localiza- 
tion and Mapping) where they concluded that as an observer 
robot observes the explorer robot, it improves the localization 
of robots by the new constraints of observer to explorer 
distance. Recognizing that odometry errors can cumulate over 
time, they suggest using constraints based on cooperative 
localization to refine the position estimates. Their approach, 
however, do not utilizes the merits of mutual observation as 
they propose that one robot explores the world and other 
robot watches. We show in our experiments, by comparison 
to ARToolKit 1 1 1 and Bundler |2|, that mutual observations of 
robots can be up to 10 times more accurate than observations 
by single robot. 

A number of groups have considered cooperative vision 
and laser based mapping in outdoor environments |T5| , |T6| 
and vision only y/TJ, Localization and mapping using 
heterogeneous robot teams with sonar sensors is examined 
extensively by |T9| , | [20| . Using more than one robot enables 
easier identification of previously mapped locations, simplify- 
ing the loop-closing problem [21 ]. 

Fox et al. | [22| propose cooperative localization based on 
Monte-Carlo localization technique. The method uses odome- 
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Tag 
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meaning 



Without Ego-Motion. All those works that use egomo- 
tion are marked as X. 

Localization using bearing only measurements. No 
depth measurements required. All those works that 
require depth measurements are marked with X. 
SLAM like tight coupling. Inaccuracy in mapping 
leads to cumulating interdependent errors in localiza- 
tion and mapping. All those works that use SLAM like 
approach are marked with a X 

Utilizes mutual observation, which is more accurate 
than one-sided observations. All those works that do 
not use mutual observation, and depend on one-sided 
observations are marked as X 



Table I: Comparison of related work with Mutual localization 



try measurements for ego motion. Chang et al. p3| uses depth 
and visual sensors to localize Nao robots in the 2D ground 
plane. Roumeliotis and Bekey |24| focus on sharing sensor 
data across robots, employing as many sensors as possible 
which include odometry and range sensors. Rekleitis et al. 
| 25 1 provide a model of robots moving in 2D equipped with 
both distance and bearing sensors. 

Zou and Tan |T0| proposed a cooperative simultaneous lo- 
calization and mapping method, CoSLAM, in which multiple 
robots concurrently observe the same scene. Correspondences 
in time (for each robot) and across robots are fed into an 
extended Kalman filter and used to simultaneously solve the 
localization and mapping problem. However, this and other 
"co-slam" approaches such as |26 | remain limited due to the 
interdependence of localization and mapping variables: errors 
in the map are propagated to localization and vice versa. 

Recently Zhou and Roumeliotis |[6|, Q have published 
solution of a set of 14 minimal solutions that covers a wide 
range of robot to robot measurements. However, they use 
egomotion for their derivation and they assume that observable 
fiducial markers coincide with the optical center of the camera. 
Our work does not make any of the two assumptions. 

III. Problem Formulation 

We use the following notation in this paper, see Fig. [T] 
Cp and Cq represent two robots, each with a camera as a 
sensor. The corresponding coordinate frames are {p} and {q} 
respectively with origin at the optical center of the camera. 
Fiducial markers Mi and M2 are fixed on robot Cq and 
hence their positions are known in frame {g} as qi, q2 G M^. 
Similarly, P3,P4 G are the positions of markers M3 
and M4 in coordinate frame {p}. Robots are positioned such 
that they can observe each others markers in their respective 
camera sensors. The 2D image coordinates of the markers Mi 



and M2 in the image captured by the camera {p} are measured 
as pi,P2 ^ and that of M3 and M4 is q3,q4 G 
in camera {q}. Let Kp^Kq G R^^^ be the intrinsic camera 
matrices of the respective camera sensors on robot Cp^Cq. 
Also, note the superscript notation. 2D image coordinates are 
denoted by a bar, example p. Unit vectors that provide bearing 
information are denoted by a caret, example p. 

Since the real life images are noisy, the measured image 
positions p^ and will differ from the actual positions p^o 
and q^o by gaussian noise r]i. 



Pi 



Pii 

^ qii 



Vqi 



yi e {1,2} 
Vz e {3,4} 



(1) 

(2) 



The problem is to determine the rotation R G R^^^ and 
translation t G R^ from frame {p} to frame {q} such that 
any point p^ in frame {p} is related to its corresponding point 
q^ in frame {q} by the following equation. 



Rpi 



(3) 



The actual projections of markers into the camera image 
frames of the other robot are governed by following equations, 

P^o =f{KpR-\q,-t)) VzG{l,2} (4) 
q,o =f{Kq{Rp,^t)) ViG{3,4} (5) 

where / is the projection function defined over a vector 

V = [vx,Vy,Vz] as 



/(v) 



(6) 



To minimize the effect of noise we must compute the optimal 
transformation, i?* and t*. 



(i?*, t*) = arg min 



\\pi-f{KpR-\cii-t))\\' 



ue{i,2} 



ie{3,4} 



f{K,{Rpi+t))f\ (7) 



To solve this system of equations we start with exact 
equations that lead to a large number of polynomial roots. 
To choose the best root among the set of roots we use the 
above minimization criteria. 

Let Pi, qi G R^ be the unit vectors drawn from the camera's 
optical center to the image projection of the markers. The 
unit vectors can be computed from the position of markers in 
camera images Pi,qi by the following equations. 

P^ = \ T VzG{l,2} (8) 

\\K^'[pjA] II 
q. = \\t\\- Vzg{3,4} (9) 

ii^."'[qz^,i] II 

Further let si, 52 be the distances of markers Mi, M2 from 
the optical center of the camera sensor in robot Cp. And S3, 
54 be the distances of markers M3, M4 from the optical center 



of camera sensor in robot Cq. Then the points qi, q2, S3q3, 
S4q4 in coordinate frame {q} correspond to the points sipi, 
P3, P4 in coordinate frame {p}. 



qi = t + siRpi 

q2 = t + S2i^P2 
53q3 = t + Rps 
S4q4 = t -\- i?p4 



(10) 



These four vector equations provide us 12 constraints (three 
for each coordinate in 3D) for our 10 unknowns (3 for rotation 
R, 3 for translation t, and 4 for Si). We first consider only the 
first three equations, which allows an exact algebraic solution 
of the nine unknowns from the nine constraints. 

Our approach to solving the system is inspired by the well 
studied problem of Perspective- 3 -points fTT], also known as 
space resection 1 11 1. However, note that the method cannot be 
directly applied to our problem as known points are distributed 
in both coordinate frames as opposed to the space resection 
problem where all the known points are in the one coordinate 
frame. 

The basic flow steps of our approach are to first solve for the 



three range factors, 5i, 52 and S3 (Section IH-A). Then we set 
up a classical absolute orientation system on the rotation and 
translation (Section |in-B| ), which is solved using established 
methods such as Arun et al. I^TJ or Horn |28|; finally, since our 
algebraic solution will give rise to many candidate roots, we 
develop a root-filtering approach to determine the best solution 
(Section [Iircl ). 



A. Solving for si, S2 and 53 

The first step is to solve the system for 5i, 52 and S3. We 
eliminate R and t by considering the inter-point distances in 
both coordinate frames. 



l^iPi - S2P2I 
\\s2P2 - Pal 
IIps - 5ipi| 



l|qi -q2ii 

||q2 - 53q3| 

||53q3 - qil 



(11) 



Squaring both sides and representing the vector norm as 
the dot product gives the following system of polynomial 
equations. 



sl^sl- 2si52p7p2 - ||qi - q2||^ = 

sl-sj- 252pjp3 + 253qjq3 + Hps IP - ||q2 
4-4- 2sip7p3 + 2s3q7q3 + ||P3|P - ||qi 



(12a) 
(12b) 
(12c) 



This system has three quadratic equations implying a Bezout 
bound of eight (2^) solutions. Using the Sylvester resultant we 
sequentially eliminate variables from each equation. Rewriting 



( |12a| ) and ( |12b| ) as quadratics in terms of S2 gives 

si + (-2sip7p2) S2 + (si - \qi - 92p) = 



si + (-2pjp3) S2 - {si - 2s3qjq3 - IIpsII^ + ||q2 



(13) 
f)=0 



(15) 



bi bo 

(14) 

The Sylvester determinant (29^, p. 123] of ([13]) and ([14]) is given 
by the determinant of the matrix formed by the coefficients of 

1 ai ao 
. . 1 ai ao 
"(^^'^^)= 1 6x 6o 
1 6i 6o 

This determinant is a quartic function in si, 53. By definition 
of resultant, the resultant is zero if and only if the parent 
equations have at least a common root |(29| . Thus we have 
eliminated variable 52 from ( [12a| ) and ( |12b| . We can repeat 
the process for eliminating 53 by rewriting r(5i, S3) and ( |12c| ) 
as: 

^(^l, 53) = C453 + C353 + C2sl + Ci53 + Co = 



2sip7p3 ■ 



|P3| 



|qil 







(16) 



The Sylvester determinant of ([16]) would be 



^2(51) = 
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Solving ( [TT] ) gives an 8 degree polynomial in si. By Abel- 
Ruffini theorem |[30j p. 131], a closed-form solution of the 
above polynomial does not exist. 

The numeric solution to ( [TT] ) gives eight roots for S3. We 
compute si and 52 using ( [12c| ) and ( |12b| ) respectively. Because 
the camera cannot see objects behind it, only real positive roots 
are maintained from the resultant solution set. 

B. Solving for R and t 

With the solutions for the scale factors, {si, 52, 53} we can 
compute the absolute location of the Markers {Mi, M2, M3} 
in both the frames {p} and {q}. 

Pi = 5iPi ViG{l,2} 
qi = SiCii yi e {3} 

These exact correspondences give rise to the classical problem 
of absolute orientation i.e. given three points in two coordinate 
frames find the relative rotation and translation between the 
frames. For each positive root of si, 52, 53 we use the method 
in Arun et. al p7| method (similar to Horn's method p8| ) to 
compute the corresponding rotation R and translation value t. 



C. Choosing the optimal root 

Completing squares in (V2) yields important information 
about redundant roots. 

{si + S2f - 2siS2{l + pjp2) - Hi - q2||' = (18a) 

{s2 - P2Psf - {s3 - qjqa)^ 

+ (P3 - P2)"^P3 - (q2 - qs) = 

(18b) 



(^1 - p^Ps)^ - {ss - q^qs)^ 



+ (P3 - Pi)"^P3 - q7(qi - qs) = 



(18c) 



Equations ( [TS] ) do not put any constraints on positivity of 
terms (s2-P2"p3), (^s-qjqs), (^i-p^Ps) or {ss-qjqs). 
However, all these terms are positive as long as the markers 
of the observed robot are farther from the camera than the 
markers of the observing robot. Also, the distances Si are 
assumed to be positive. Assuming the above, we filter the real 
roots by the following criteria: 



^1 > IIpsII 
^2 > IIpsII 

53 > max(||qi||, ||q2| 



(19) 
(20) 
(21) 



These criteria not only reduce the number of roots signifi- 
cantly, but also filter out certain degenerate cases. 

For all the filtered roots of ( [TT] ), we compute the correspond- 
ing values of R and t, choosing the best root that minimizes 
the error function, ([7]). 

D. Extension to more than three markers 

Even though the system is solvable by only three markers, 
we choose to use four markers for symmetry. We can fall back 
to the three marker solution in situations when one of the 
markers is occluded. Once we extend this system to 4 marker 
points, we obtain 6 bivariate quadratic equations instead of the 
three in ([12]) that can be reduced to three 8-degree univariate 
polynomials. The approach to finding the root with the least 
error is the same as described above. 

The problem of finding relative pose from five or more 
markers is better addressed by solving for the homography 
when two cameras observe the same set of points as done by 
l[3ll-|[34|. The difference for us is that the distance between 
the points in both coordinate frames is known hence we can 
estimate the translation metrically which is not the case in 
classical homography estimation. Assuming the setup for five 
points such that ( [TO] ) becomes 

qi = t + siRpi 

q2 = t + S2RP2 
53qs = t^Rp3 (22) 
54q4 = t-\- Rp4 

= t^ Rp^ 




Figure 2: The deployment of markers on Turtlebot that we 
used for our experiments 



If the essential matrix is E, the setup is the same as solving 
for 

[qi,q2,q3,q4,q5]^^[Pl,P2,P3,P4,P5] = (23) 

The scale ambiguity of the problem can be resolved by one of 
the distance relations from (U) . Please refer to 1 32 1 for solving 
( [23] ). For more points refer to |35| for the widely known 7- 
point and linear 8 -point algorithms. 

IV. Implementation 

We implement our algorithm on two Turtlebots with fiducial 
markers. One of the Turtlebots with markers is shown in 
Fig. |2] We have implemented the algorithm in Python using 
the Sympy OpenCV |[37) and Numpy |[38) Hbraries. As 
the implementing software formulates and solves polynomials 
symbolically, it is generic enough to handle any reasonable 
number of points in two camera coordinate frames. We have 
tested the solver for the following combination of points: 0-3, 
1-2, 2-2, where 1-2 means that 1 point is known in the first 
coordinate frame and 2 points are known in the second. 

We use blinking lights as fiducial markers on the robots and 
barcode-like cylindrical markers as for the 3D reconstruction 
experiment. 

The detection of blinking lights follows a simple thresh- 
olding strategy on the time differential of images. This ap- 
proach coupled with decaying confidence tracking produces 
satisfactory results for simple motion of robots and relatively 
static backgrounds. Fig. [3] shows the cameras mounted with 
blinking lights as fiducial markers. The robots shown in |3] 
are also mounted with ARToolKit 1 1 1 fiducial markers for the 
comparison experiments. 

V. Experiments 

To assess the accuracy of our method we perform a lo- 
calization experiment in which we measure how accurately 
our method can determine the pose of the other camera. We 





Median Trans, error 


Median Rotation error 


ARToolKit 1 1 1 


0.57m 


9.2° 


Bundler | 2 | 


0.20m 


0.016° 


Mutual Localization 


0.016m 


0.33° 



Table II: Table showing mean translation and rotation error for 
ARToolKit, Bundler and Mutual Localization 



compare our localization results with the widely used fiducial- 
based pose estimation in ARToolKit |1J and visual egomotion 
and SfM framework Bundler ||2|. We also generate a semi- 
dense reconstruction to compare the mapping accuracy of our 
method to that of Bundler. A good quality reconstruction, is 
a measure of the accuracy of mutual localization of the two 
cameras used in the reconstruction. 

A. Localization Experiment 

a) Setup: Two turtlebots were set up to face each other. 
One of the turtlebot was kept stationary and the other moved 
in 1 ft increments in an X-Z plane (Y-axis is down, Z-axis 
is along the optical axis of the static camera and the X- 
axis is towards the right of the static camera). We calculate 
the rotation error by extracting the rotation angle from the 
differential rotation Rj^Rest as follows: 

180 f TT{Rj,R,st)-l \ 
Ee = ^ arccos I ^ I (24) 

where Rgt is the ground truth rotation matrix. Rest is the 
estimated rotation matrix and Tr is the matrix trace. The 
translation error is simply the norm difference between two 
translation vectors. 

b) Results in comparison with ARToolKit fT]: The AR- 
ToolKit is an open source library for detecting and determining 
the pose of fiducial markers from video. We use a ROS p9| 
wrapper - ar _j)ose - over ARToolKit for our experiments. 
We repeat the relative camera localization experiment with 
the ARToolKit library and compare to our results. The results 
show a tenfold improvement in translation error over Bundler 
|2|. 

B. Simulation experiments with noise 

A simple scene was constructed in Blender to verify the 
mathematical correctness of the method. Two cameras were 
set up in the blender scene along with a target object Im from 
the static camera. Camera images were rendered at a resolution 
of 960 X 540. The markers were simulated as colored balls 
that were detected by simple hue based thresholding. The two 
cameras in the simulated scene were rotated and translated to 
cover maximum range of motion. After detection of the center 
of the colored balls, zero mean gaussian noise was added to 
the detected positions to investigate the noise characteristics 
of our method. The experiment was repeated with different 
values of noise covariance. Fig |6] shows the translation and 
rotation error in the experiment with variation in noise. It can 
be seen that our method is robust to noise as it deviates only 
by 5cm and 2.5° when tested with noise of up to 10 pixels. 




Figure 3: Diagram of the two camera setup for mutual localization 3D metric reconstruction, along with images from each 
camera for two poses of the mobile camera. Cameras have distinctive cylindrical barcode-like markers to aid detection in each 
others image frames. Also depicted is the triangulation to two example feature points. 
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Figure 4: Translation error comparison between the ARToolKit 
and our mutual localization. The translation error is plotted to 
ground truth X and Z axis positions to show how error varies 
with depth (Z) and lateral (X) movements. We get better results 
in localization by a factor of ten. Also note how the translation 
error increases with Z-axis (inter-camera separation). 
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Figure 5: Rotation error comparison between the ARToolKit 
and Mutual localization. Rotation error decreases with Z-axis 
(ground truth inter-camera separation). See ([24]) for computa- 
tion of rotation error. 



C. 3D Reconstruction experiment 

The position and orientation obtained from our method 
is inputted into the patch based multi-view stereo (PMVS- 
2) library |40j| to obtain a semi-dense reconstruction of an 




Noise (pixels) 

Figure 6: Rotation and translation error as noise is incremen- 
tally added to the detection of markers. 



indoor environment. Our reconstruction is less noisy when 
compared to that obtained by Bundler |2|. Fig. [7] shows a side- 
by-side snapshot of the semi-dense map from Bundler-PMVS 
and, our method, Mutual Localization-PMVS. To compare the 
reconstruction accuracy, we captured the scene as a point 
cloud with an RGB-D camera (Asus-Xtion). The Bundler 
and Mutual Localization output point clouds were manually 
aligned (and scaled) to the Asus-Xtion point cloud. We then 
computed the nearest neighbor distance from each point in 
the Bundler/Mutual localization point clouds discarding points 
with nearest neighbors further than Im as outliers. With this 
metric the mean nearest neighbor distance for our method was 
0.176m while that for Bundler was 0.331m. 

VI. Conclusion 

We have developed a method to cooperatively localize two 
cameras using fiducial markers on the cameras in sensor- 
specific coordinate frames, obviating the common assumption 
of sensor egomotion. We have compared our results with the 
ARToolKit showing that our method can localize significantly 
more accurately, with a tenfold error reduction observed in our 
experiments. We have also demonstrated how the cooperative 
localization can be used as an input for 3D reconstruction of 
unknown environments, and find better accuracy (0.18m versus 
0.33m) than the visual egomotion-based Bundler method. We 
plan to build on this work and apply it to multiple robots for 
cooperative mapping. Though we achieve reasonable accuracy, 
we believe we can improve the accuracy of our method by 
improving camera calibration and measurement of the fiducial 
marker locations with respect to the camera optical center. 



We will release the source code (open-source) for our method 
upon publication. 
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